Interpersonal Relationship


Culturally-Aware Conversations: A Framework & Benchmark for LLMs

Havaldar, Shreya, Rai, Sunny, Cho, Young-Min, Ungar, Lyle

arXiv.org Artificial Intelligence

Existing benchmarks that measure cultural adaptation in LLMs are misaligned with the actual challenges these models face when interacting with users from diverse cultural backgrounds. In this work, we introduce the first framework and benchmark designed to evaluate LLMs in realistic, multicultural conversational settings. Grounded in sociocultural theory, our framework formalizes how linguistic style - a key element of cultural communication - is shaped by situational, relational, and cultural context. We construct a benchmark dataset based on this framework, annotated by culturally diverse raters, and propose a new set of desiderata for cross-cultural evaluation in NLP: conversational framing, stylistic sensitivity, and subjective correctness. We evaluate today's top LLMs on our benchmark and show that these models struggle with cultural adaptation in a conversational setting.


Psychologically Enhanced AI Agents

Besta, Maciej, Chandran, Shriram, Gerstenberger, Robert, Lindner, Mathis, Chrapek, Marcin, Martschat, Sebastian Hermann, Ghandi, Taraneh, Iff, Patrick, Niewiadomski, Hubert, Nyczyk, Piotr, Müller, Jürgen, Hoefler, Torsten

arXiv.org Artificial Intelligence

We introduce MBTI-in-Thoughts, a framework for enhancing the effectiveness of Large Language Model (LLM) agents through psychologically grounded personality conditioning. Drawing on the Myers-Briggs Type Indicator (MBTI), our method primes agents with distinct personality archetypes via prompt engineering, enabling control over behavior along two foundational axes of human psychology: cognition and affect. We show that such personality priming yields consistent, interpretable behavioral biases across diverse tasks: emotionally expressive agents excel in narrative generation, while analytically primed agents adopt more stable strategies in game-theoretic settings. Our framework supports experimenting with structured multi-agent communication protocols and reveals that self-reflection prior to interaction improves cooperation and reasoning quality. To ensure trait persistence, we integrate the official 16Personalities test for automated verification. While our focus is on MBTI, we show that our approach generalizes seamlessly to other psychological frameworks such as the Big Five, HEXACO, and the Enneagram. By bridging psychological theory and LLM behavior design, we establish a foundation for psychologically enhanced AI agents without any fine-tuning.
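The personality conditioning described above can be sketched as a system-prompt builder. This is a minimal, hypothetical illustration: the archetype descriptions and prompt wording are my own, not the paper's actual prompts.

```python
# Hypothetical sketch of personality priming via prompt engineering.
# Archetype trait summaries and prompt phrasing are illustrative only.

ARCHETYPES = {
    "INTJ": "analytical, strategic, and reserved; favors long-term planning",
    "ENFP": "emotionally expressive, imaginative, and spontaneous",
}

def build_persona_prompt(mbti_type: str, task: str) -> str:
    """Compose a system prompt that primes an LLM agent with an MBTI archetype."""
    traits = ARCHETYPES[mbti_type]
    return (
        f"You are an agent with an {mbti_type} personality: {traits}. "
        f"Stay in character for every response.\n\n"
        f"Task: {task}"
    )

prompt = build_persona_prompt(
    "INTJ", "Choose a strategy in a repeated prisoner's dilemma."
)
print(prompt)
```

In a full pipeline, this string would be passed as the system message of a chat-style LLM call, and trait persistence would be checked by having the primed agent answer a personality questionnaire.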


Does chat change LLM's mind? Impact of Conversation on Psychological States of LLMs

Choi, Junhyuk, Hong, Yeseon, Kim, Minju, Kim, Bugeun

arXiv.org Artificial Intelligence

The recent growth of large language models (LLMs) has enabled more authentic, human-centered interactions through multi-agent systems. However, investigation into how conversations affect the psychological states of LLMs is limited, despite the impact of these states on the usability of LLM-based systems. In this study, we explored whether psychological states change during multi-agent interactions, focusing on the effects of conversation depth, topic, and speaker. We experimentally investigated the behavior of 10 LLMs in open-domain conversations. We employed 14 questionnaires and a topic-analysis method to examine the behavior of LLMs across four aspects: personality, interpersonal relationships, motivation, and emotion. The results revealed distinct psychological trends influenced by conversation depth and topic, with significant variations observed between different LLM families and parameter sizes.


Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench

Huang, Jen-tse, Wang, Wenxuan, Li, Eric John, Lam, Man Ho, Ren, Shujie, Yuan, Youliang, Jiao, Wenxiang, Tu, Zhaopeng, Lyu, Michael R.

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have recently showcased their remarkable capacities, not only in natural language processing tasks but also across diverse domains such as clinical medicine, legal consultation, and education. LLMs have become more than mere applications, evolving into assistants capable of addressing diverse user requests. This narrows the distinction between human beings and artificial intelligence agents, raising intriguing questions regarding the potential manifestation of personalities, temperaments, and emotions within LLMs. In this paper, we propose a framework, PsychoBench, for evaluating diverse psychological aspects of LLMs. Comprising thirteen scales commonly used in clinical psychology, PsychoBench further classifies these scales into four distinct categories: personality traits, interpersonal relationships, motivational tests, and emotional abilities. Our study examines five popular models, namely text-davinci-003, gpt-3.5-turbo, gpt-4, LLaMA-2-7b, and LLaMA-2-13b. Additionally, we employ a jailbreak approach to bypass the safety alignment protocols and test the intrinsic natures of LLMs. We have made PsychoBench openly accessible via https://github.com/CUHK-ARISE/PsychoBench.


Predicting the Quality of Revisions in Argumentative Writing

Liu, Zhexiong, Litman, Diane, Wang, Elaine, Matsumura, Lindsay, Correnti, Richard

arXiv.org Artificial Intelligence

The ability to revise in response to feedback is critical to students' writing success. In the case of argument writing in particular, identifying whether an argument revision (AR) is successful is a complex problem because AR quality depends on the overall content of an argument. For example, adding the same evidence sentence could strengthen or weaken existing claims in different argument contexts (ACs). To address this issue, we developed Chain-of-Thought prompts to facilitate ChatGPT-generated ACs for AR quality prediction. Experiments on two corpora, our annotated elementary school essays and an existing benchmark of college essays, demonstrate the superiority of the proposed ACs over baselines.
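A Chain-of-Thought prompt of the kind described could look like the sketch below. The step wording and labels are hypothetical; the paper's actual prompts may differ.

```python
# Hypothetical sketch of a Chain-of-Thought prompt that elicits an
# argument context (AC) before judging an argument revision (AR).

def build_ac_prompt(essay: str, revision: str) -> str:
    """Build a CoT prompt asking the model to reconstruct the argument
    context before classifying the revision as successful or not."""
    return (
        "Read the essay and the revision below, then reason step by step.\n"
        f"Essay: {essay}\n"
        f"Revision: {revision}\n"
        "Step 1: Summarize the essay's main claim.\n"
        "Step 2: List the evidence currently supporting that claim.\n"
        "Step 3: Given this argument context, does the revision strengthen "
        "or weaken the claim? Answer 'successful' or 'unsuccessful'."
    )

demo = build_ac_prompt(
    "School uniforms reduce bullying.",
    "Added a statistic about bullying rates in uniformed schools.",
)
print(demo)
```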


Teens are turning to 'My AI' for mental health support -- which doctors warn against

FOX News

Anyone who uses Snapchat now has free access to My AI, the app's built-in artificial intelligence chatbot, first released as a paid feature in February. In addition to serving as a chat companion, the bot can also have some practical purposes, such as offering gift-buying advice, planning trips, suggesting recipes and answering trivia questions, according to Snap. However, while it's not billed as a source of medical advice, some teens have turned to My AI for mental health support -- something many medical experts caution against.


A Picture May Be Worth a Thousand Lives: An Interpretable Artificial Intelligence Strategy for Predictions of Suicide Risk from Social Media Images

Badian, Yael, Ophir, Yaakov, Tikochinski, Refael, Calderon, Nitay, Klomek, Anat Brunstein, Reichart, Roi

arXiv.org Artificial Intelligence

Promising research on Artificial Intelligence in suicide prevention has principal gaps, including black-box methodologies, inadequate outcome measures, and scarce research on non-verbal inputs such as social media images (despite their popularity in today's digital era). This study addresses these gaps and combines theory-driven and bottom-up strategies to construct a hybrid and interpretable model for predicting valid suicide risk from images. The lead hypothesis was that images contain valuable information about emotions and interpersonal relationships, two central concepts in suicide-related treatments and theories. The dataset included 177,220 images by 841 Facebook users who completed a gold-standard suicide scale. The images were represented with CLIP, a state-of-the-art algorithm, which was utilized, unconventionally, to extract predefined features that served as inputs to a simple logistic-regression prediction model (in contrast to complex neural networks). The features addressed basic and theory-driven visual elements using everyday language (e.g., bright photo, photo of sad people). The results of the hybrid model (which integrated theory-driven and bottom-up methods) indicated high prediction performance that surpassed common bottom-up algorithms, thus providing a first proof that images alone can be leveraged to predict validated suicide risk. Corresponding with the lead hypothesis, at-risk users had images with increased negative emotions and decreased belongingness. The results are discussed in the context of non-verbal warning signs of suicide. Notably, the study illustrates the advantages of hybrid models in such complicated tasks and provides simple and flexible prediction strategies that could be used to develop real-life suicide-monitoring tools.
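The hybrid pipeline can be sketched in a few lines: CLIP image-text similarity scores for plain-language feature descriptions feed a simple logistic regression. In this sketch the feature names, weights, and scores are illustrative stand-ins; in the study, scores would come from CLIP and weights would be fit on labeled users.

```python
import math

# Sketch of the hybrid pipeline: per-user mean CLIP similarity scores for
# theory-driven, everyday-language features go into a logistic regression.
# Weights and bias below are illustrative, not fitted values.

FEATURES = ["bright photo", "photo of sad people", "photo of a group of friends"]
WEIGHTS = {"bright photo": -0.8,
           "photo of sad people": 1.2,
           "photo of a group of friends": -1.0}
BIAS = -0.3

def risk_probability(clip_scores: dict) -> float:
    """Logistic regression over interpretable CLIP feature scores."""
    z = BIAS + sum(WEIGHTS[f] * clip_scores[f] for f in FEATURES)
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid

# Example: a user whose images skew dark, sad, and solitary.
p = risk_probability({"bright photo": 0.2,
                      "photo of sad people": 0.9,
                      "photo of a group of friends": 0.1})
print(round(p, 3))
```

Because each weight attaches to a named, everyday-language feature, the model stays interpretable: one can read off that sad-people imagery raises predicted risk while bright and group photos lower it.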


Towards Inter-character Relationship-driven Story Generation

Vijjini, Anvesh Rao, Brahman, Faeze, Chaturvedi, Snigdha

arXiv.org Artificial Intelligence

In this paper, we introduce the task of modeling interpersonal relationships for story generation. To address this task, we propose Relationships as Latent Variables for Story Generation (ReLiSt). ReLiSt generates stories sentence by sentence and has two major components - a relationship selector and a story continuer. The relationship selector specifies a latent variable to pick the relationship to exhibit in the next sentence, and the story continuer generates the next sentence while expressing the selected relationship in a coherent way. Our automatic and human evaluations demonstrate that ReLiSt generates stories whose relationships are more faithful to the desired ones while maintaining content quality. The relationship assignments to sentences during inference bring interpretability to ReLiSt.
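The selector/continuer loop described above can be sketched as follows. Both components are stubbed here with placeholders; in the paper they are learned neural models, and the relationship inventory below is my own example.

```python
import random

# Minimal sketch of a ReLiSt-style generation loop: a relationship selector
# picks a latent relationship for each sentence, and a story continuer
# produces the next sentence expressing it. Both components are stubs.

RELATIONSHIPS = ["friends", "rivals", "strangers"]

def select_relationship(story_so_far):
    # Stub for the learned relationship selector (the latent variable).
    return random.choice(RELATIONSHIPS)

def continue_story(story_so_far, relationship):
    # Stub for the learned story continuer.
    return f"[sentence expressing that the characters are {relationship}]"

def generate(n_sentences=3):
    story, trace = [], []
    for _ in range(n_sentences):
        rel = select_relationship(story)          # relationship to exhibit next
        story.append(continue_story(story, rel))  # realize it in the sentence
        trace.append(rel)  # per-sentence assignments give interpretability
    return story, trace
```

The `trace` returned alongside the story is what makes the model interpretable: every generated sentence carries an explicit relationship label.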


AI won't replace psychologists, it will only make them better

#artificialintelligence

The app store was flooded with chatbots and automated digital programs, based on machine learning and artificial intelligence (AI), and designed to provide a therapeutic solution to common human problems such as stress, depression, and anxiety. These apps have been marketed as the next hot thing that would replace the need to attend treatment and open up to a therapist. But if we learned anything from the Covid-19 pandemic, it is that interpersonal relationships are a fundamental human need. Therapists have reported a threefold increase in demand for psychological treatment since the onset of the pandemic. Psychologists, more than ever, are inundated by referrals and increased distress in their clients.


Robot Take-Over

#artificialintelligence

Some people are worried about robots taking our jobs. Not just kind of worried, but really worried that this turning point in our society could be the start of a popular dystopian sci-fi scenario. These people aren't foolish Luddites; they're educated members of academic society who want to contribute to the ongoing vision of our collective future. There is every reason for concern, but it doesn't have to be scary. It's a change that's going to happen regardless, and we had better be ready for it.
